Comparison of the Intel Pentium MMX-enhanced microprocessor with the AMD K6 MMX microprocessor

Assignment 1

Objectives

 

Report’s Recipient: Steven Lee (Lecturer)

Report’s Author: Neoh Choo Hoo (Student)

Student No.: C04979

Group: A

Date: 16th October 1998

 

Research Method: Most of the information was obtained from the Internet.

Technical Overview of Pentium MMX

MMX is commonly read as an abbreviation of Multimedia Extension. Intel’s MMX technology is designed to accelerate multimedia and communications applications through a set of 57 new instructions. It also includes new data types.

MMX technology is designed as a set of basic, general-purpose integer instructions that can be easily applied to the needs of a wide diversity of multimedia and communications applications. A highlight of the technology is its Single Instruction, Multiple Data (SIMD) technique.

SIMD allows many pieces of information to be processed with a single instruction, providing parallelism that greatly increases performance. Combined with the superscalar Intel Architecture, this technology provides a substantial performance enhancement to the PC platform. At the same time, MMX technology maintains full compatibility with current operating systems such as Windows 98 and 95, MS-DOS, OS/2, and UNIX.
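The SIMD idea can be illustrated with a small sketch. It models in Python what a packed byte addition such as the MMX PADDB instruction does: eight 8-bit values held in one 64-bit quantity are added lane by lane, and each lane wraps around without carrying into its neighbour. The function name `paddb` and the use of plain Python integers as 64-bit registers are assumptions made for illustration only; a real processor performs all eight additions in one instruction rather than in a loop.

```python
def paddb(a: int, b: int) -> int:
    """Add eight packed unsigned bytes held in two 64-bit integers.

    Each 8-bit lane is added independently and wraps modulo 256,
    so a carry never spills into the neighbouring lane.
    """
    result = 0
    for lane in range(8):
        shift = 8 * lane
        x = (a >> shift) & 0xFF          # extract lane from first operand
        y = (b >> shift) & 0xFF          # extract lane from second operand
        result |= ((x + y) & 0xFF) << shift
    return result


if __name__ == "__main__":
    # Eight byte additions expressed as one "instruction".
    print(hex(paddb(0x0102030405060708, 0x1010101010101010)))
```

Without SIMD, the same work would take eight separate add instructions, which is where the performance gain for multimedia data such as pixels and audio samples is expected to come from.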

AMD K6 MMX Technical Brief

The AMD K6 MMX processor is AMD’s sixth-generation processor. It brings advanced, six-issue RISC86 superscalar performance to mainstream PCs running both 16-bit and 32-bit code, delivering strong performance under both the Microsoft Windows 95 and Windows NT operating systems as well as on the installed base of x86 software. Its Socket 7 bus interface also allows PC manufacturers and resellers to leverage today’s mature, cost-effective infrastructure to quickly bring PC systems with superior price and performance to market.

The AMD K6 MMX delivers sixth-generation performance competitive with the Pentium II processor. Its advanced six-issue RISC86 superscalar microarchitecture provides seven parallel execution units, multiple sophisticated x86-to-RISC86 instruction decoders, advanced two-level branch prediction, speculative execution, full out-of-order execution, register renaming, and data forwarding. The AMD K6 MMX also includes a high-performance IEEE 754-compatible floating-point unit (FPU), executes the industry-standard MMX instructions, and supports an industry-compatible System Management Mode (SMM). It is available in a Ceramic Pin Grid Array (CPGA) package (Socket 7 compatible) that uses C4 flip-chip technology, and it is manufactured on AMD’s 0.25- or 0.35-micron, five-layer-metal silicon process at AMD’s Fab 25 wafer fabrication facility.

 

Processor Features                 | AMD K6 MMX                        | Pentium MMX
Superscalar Architecture           | X                                 | X
High-performance RISC Core         | Yes/6 issue (RISC86)              | -
Speculative Execution              | X                                 | -
Out-of-order Execution             | X                                 | -
Data Forwarding                    | X                                 | -
Register Renaming                  | X                                 | -
x86 Decoders                       | 2 sophisticated, 1 long, 1 vector | 1 sophisticated, 1 simple
MMX                                | X                                 | X
High-performance Floating Point    | X                                 | X
Industry Compatible SMM            | X                                 | X
L1 Instruction and Data Cache      | 32K + 32K                         | 16K + 16K
Execution Units                    | 7                                 | 2
Branch Prediction                  | X                                 | X
Advanced 2 Level Branch Prediction | X                                 | -
Branch Target Cache Entries        | 16                                | 0
Branch History Table Entries       | 8,192                             | 256
Branch Prediction Accuracy         | 95%                               | 75% - 80%
Processor Bus                      | Socket 7 66MHz                    | Socket 7 66MHz
Bus Width                          | 64-bit                            | 64-bit
Max. Bandwidth (MB/sec)            | 528                               | 528
Latency (smaller is better)        | 2 clock                           | 2 clock

(X = feature present, - = feature absent)
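The maximum bandwidth figure of 528 in the comparison above follows directly from the bus width and the bus clock. The short calculation below reproduces it, assuming one 64-bit transfer per bus clock and counting 1 MB as one million bytes; the function name `peak_bandwidth_mb_per_s` is a made-up helper for this report, not a vendor formula.

```python
def peak_bandwidth_mb_per_s(bus_width_bits: int, bus_clock_mhz: int) -> int:
    """Peak bus bandwidth in MB/sec, assuming one transfer per bus clock."""
    bytes_per_transfer = bus_width_bits // 8   # 64 bits -> 8 bytes
    return bytes_per_transfer * bus_clock_mhz  # bytes/transfer * million transfers/sec


if __name__ == "__main__":
    # Socket 7: 64-bit bus at 66 MHz, shared by both processors.
    print(peak_bandwidth_mb_per_s(64, 66))
```

Because both processors use the same Socket 7 bus, the result is identical for the AMD K6 MMX and the Pentium MMX; any performance difference has to come from inside the processor.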

 

Superscalar Architecture

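Both the AMD K6 MMX and the Pentium MMX have a superscalar architecture, which can execute more than one instruction per clock cycle while keeping full x86 binary software compatibility. In the AMD K6 MMX, this architecture is intended to deliver enhanced sixth-generation performance.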

High-performance RISC core

Only the AMD K6 MMX includes a RISC core. AMD’s RISC86 microarchitecture implements the x86 instruction set by internally decoding x86 instructions into RISC86 operations, which directly support the x86 instruction set. RISC86 enables higher processor core performance and promotes straightforward extensibility in future designs. Rather than directly executing complex x86 instructions, which have lengths of 1 to 15 bytes, the AMD K6 MMX processor executes the simpler, fixed-length RISC86 operations while maintaining the instruction coding efficiencies found in x86 programs. State-of-the-art design techniques include multiple x86 instruction decode, single-clock internal RISC operations, out-of-order execution, data forwarding, speculative execution, and register renaming.
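The idea of breaking a complex instruction into simpler internal operations can be sketched with a toy decoder. This is only an illustration of the principle: the textual instruction format, the micro-operation names (`load`, `store`), and the temporary register names are all hypothetical and do not reflect the actual RISC86 encoding, which this report does not describe.

```python
from typing import List, Tuple

# (operation, destination, source); a made-up format for illustration only.
MicroOp = Tuple[str, str, str]


def decode(instruction: str) -> List[MicroOp]:
    """Split a toy 'op dst, src' instruction into register-only micro-ops.

    Memory operands (written in square brackets) are handled by separate
    load and store micro-ops, so the arithmetic step only touches registers.
    """
    op, operands = instruction.split(maxsplit=1)
    dst, src = [part.strip() for part in operands.split(",")]
    micro_ops: List[MicroOp] = []
    if src.startswith("["):
        micro_ops.append(("load", "tmp_src", src))   # fetch memory source
        src = "tmp_src"
    if dst.startswith("["):
        micro_ops.append(("load", "tmp_dst", dst))   # fetch memory destination
        micro_ops.append((op, "tmp_dst", src))       # compute in a register
        micro_ops.append(("store", dst, "tmp_dst"))  # write result back
    else:
        micro_ops.append((op, dst, src))
    return micro_ops


if __name__ == "__main__":
    for text in ("add eax, ebx", "add [mem], eax"):
        print(text, "->", decode(text))
```

In this sketch a register-to-register add stays a single operation, while an add to memory becomes a load, an add, and a store. Simple operations of this kind are easier to schedule out of order across several execution units than the original variable-length instruction.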

x86 decoder

The decoders translate up to two x86 instructions per clock into RISC86 operations. The decoder types are:

  1. Short decoder - Decodes the most commonly used x86 instructions.
  2. Long decoder - Decodes the semi-common and commonly used instructions.
  3. Vector decoder - Decodes uncommon, complex x86 instructions.

The AMD K6 MMX has 2 sophisticated, 1 long, and 1 vector decoder, whereas the Pentium MMX has only 1 sophisticated and 1 simple decoder.

Execution units

The AMD K6 MMX processor contains 7 independent execution units, each capable of handling RISC86 operations:

Load Unit - Performs data memory reads with a two-stage pipeline; data is available from this unit after 2 clocks.

Store Unit - Performs data writes and register calculations with a two-stage pipeline; data memory and register writes from stores are available after 1 clock.

Integer X Unit - Handles ALU operations, multiplies, divides, shifts, and rotates.

Multimedia Unit - Executes all MMX instructions.

Integer Y Unit - Handles the basic word and doubleword ALU operations.

Floating-Point Unit - Executes all floating-point instructions.

Branch Unit - Resolves conditional branches after they have been evaluated.

L1 instruction and data cache

The AMD K6 MMX processor’s write-back L1 cache features a separate 32-Kbyte instruction cache and a 32-Kbyte data cache with two-way set associativity. The cache lines are prefetched from main memory using an efficient pipelined burst transaction. The Pentium MMX contains only a 16-Kbyte instruction cache and a 16-Kbyte data cache, so it can be expected to incur more cache misses and run slower than the AMD K6 MMX on code that benefits from the larger cache.
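To make "two-way set associativity" more concrete, the sketch below works out how a cache of a given size is divided into sets and which set a memory address maps to. The 32-byte line size is an assumption chosen for illustration and is not taken from this report; the function names are likewise hypothetical.

```python
def number_of_sets(size_bytes: int, ways: int, line_bytes: int) -> int:
    """Number of sets in a set-associative cache."""
    return size_bytes // (ways * line_bytes)


def set_index(address: int, size_bytes: int = 32 * 1024,
              ways: int = 2, line_bytes: int = 32) -> int:
    """Set that an address maps to (line_bytes = 32 is an assumed value)."""
    sets = number_of_sets(size_bytes, ways, line_bytes)
    return (address // line_bytes) % sets


if __name__ == "__main__":
    print(number_of_sets(32 * 1024, 2, 32))   # sets in a 32-Kbyte, two-way cache
    print(set_index(0x1234))                  # set chosen for one example address
```

Under these assumptions, each set can hold two lines, so two addresses that map to the same set can stay in the cache together; a third one forces a replacement. Doubling the cache size with the same line size and associativity doubles the number of sets, which reduces how often that happens.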

Branch Prediction

Branch prediction is designed to minimize or hide the impact of changes in program flow. When a conditional branch is not taken, the processor continues decoding and executing the next instruction in memory. AMD implemented an 8,192-entry branch history table for the AMD K6 MMX; the Pentium MMX, on the other hand, has only a 256-entry branch history table. To accommodate this large branch history table, the AMD K6 MMX processor does not store predicted target addresses.

To avoid a one-clock fetch penalty when a branch is predicted taken, a built-in branch target cache supplies the first 16 bytes of target instructions directly to the instruction buffer. The branch target cache is organized as 16 entries of 16 bytes. The Pentium MMX, by contrast, does not have a branch target cache.
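A branch history table can be pictured as an array of small counters that remember how branches behaved recently. The sketch below is a generic two-level predictor written for illustration: it combines the branch address with a short global history of recent outcomes to pick one of 8,192 two-bit counters. The 9-bit history length, the XOR indexing, and the class and function names are assumptions, and the code is not a description of AMD’s actual circuit.

```python
class TwoLevelPredictor:
    """Generic two-level predictor with two-bit saturating counters."""

    def __init__(self, entries: int = 8192, history_bits: int = 9):
        self.entries = entries
        self.history_mask = (1 << history_bits) - 1
        self.history = 0                 # recent outcomes, newest in bit 0
        self.counters = [1] * entries    # 0-1 predict not taken, 2-3 predict taken

    def _index(self, pc: int) -> int:
        # Assumed indexing scheme: branch address XOR global history.
        return (pc ^ self.history) % self.entries

    def predict(self, pc: int) -> bool:
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


def accuracy(predictor: TwoLevelPredictor, pc: int, outcomes) -> float:
    """Fraction of outcomes predicted correctly for one branch address."""
    hits = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            hits += 1
        predictor.update(pc, taken)
    return hits / len(outcomes)


if __name__ == "__main__":
    # A loop branch that is taken 1000 times in a row.
    print(accuracy(TwoLevelPredictor(), 0x400, [True] * 1000))
```

After a short warm-up the predictor gets a steadily taken loop branch right almost every time. A larger table lets more branches keep their own counters without overwriting each other, which is one plausible reason for the higher prediction accuracy quoted for the AMD K6 MMX.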

The benchmark criteria are as follows:

Mainstream PC Configuration: Winstone 97

 

AMD K6 200 MMX    | Pentium 200 MMX
FIC PA-2011       | Gigabyte GA-586HX
VP2 chipset       | Intel Triton HX chipset
512K L2 cache     | 512K L2 cache
32MB EDO DRAM     | 32MB EDO DRAM

 

In conclusion, the Pentium MMX brings more power to multimedia and communication applications. It adds new data types and instructions that can process data in parallel. It also helps establish a new paradigm in the industry, with the PC as an improved communication and multimedia device.

The AMD K6 MMX, on the other hand, is a faster, more advanced, easier to use, and more affordable solution than the Pentium MMX at the same clock speed.